- Title
- Manipulating the alpha level cannot cure significance testing
- Creator
- Trafimow, David; Amrhein, Valentin; Chaigneau, Sergio E.; Ciocca, Daniel R.; Correa, Juan C.; Cousineau, Denis; de Boer, Michiel R.; Dhar, Subhra S.; Dolgov, Igor; Gómez-Benito, Juana; Grendar, Marian; Grice, James W.; Areshenkoff, Corson N.; Guerrero-Gimenez, Martin E.; Gutiérrez, Andrés; Huedo-Medina, Tania B.; Jaffe, Klaus; Janyan, Armina; Karimnezhad, Ali; Korner-Nievergelt, Fränzi; Kosugi, Koji; Lachmair, Martin; Ledesma, Rubén D.; Barrera-Causil, Carlos J.; Limongi, Roberto; Liuzza, Marco T.; Lombardo, Rosaria; Marks, Michael J.; Meinlschmidt, Gunther; Nalborczyk, Ladislas; Nguyen, Hung T.; Ospina, Raydonal; Perezgonzalez, Jose D.; Pfister, Roland; Beh, Eric J.; Rahona, Juan J.; Rodríguez-Medina, David A.; Romão, Xavier; Ruiz-Fernández, Susana; Suarez, Isabel; Tegethoff, Marion; Tejo, Mauricio; van de Schoot, Rens; Vankov, Ivan I.; Velasco-Forero, Santiago; Bilgiç, Yusuf K.; Wang, Tonghui; Yamada, Yuki; Zoppino, Felipe C. M.; Marmolejo-Ramos, Fernando; Bono, Roser; Bradley, Michael T.; Briggs, William M.; Cepeda-Freyre, Héctor A.
- Relation
- Frontiers in Psychology Vol. 9, no. 699
- Publisher Link
- http://dx.doi.org/10.3389/fpsyg.2018.00699
- Publisher
- Frontiers Research Foundation
- Resource Type
- journal article
- Date
- 2018
- Description
- We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = 0.05 to p = 0.005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools should be taken as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold of 0.05, 0.01, 0.005, or anything else, is not acceptable.
- Subject
- statistical significance; null hypothesis testing; p-value; significance testing; decision making
- Identifier
- http://hdl.handle.net/1959.13/1386473
- Identifier
- uon:32424
- Identifier
- ISSN:1664-1078
- Rights
- © 2018 Trafimow, Amrhein, Areshenkoff, Barrera-Causil, Beh, Bilgiç, Bono, Bradley, Briggs, Cepeda-Freyre, Chaigneau, Ciocca, Correa, Cousineau, de Boer, Dhar, Dolgov, Gómez-Benito, Grendar, Grice, Guerrero-Gimenez, Gutiérrez, Huedo-Medina, Jaffe, Janyan, Karimnezhad, Korner-Nievergelt, Kosugi, Lachmair, Ledesma, Limongi, Liuzza, Lombardo, Marks, Meinlschmidt, Nalborczyk, Nguyen, Ospina, Perezgonzalez, Pfister, Rahona, Rodríguez-Medina, Romão, Ruiz-Fernández, Suarez, Tegethoff, Tejo, van de Schoot, Vankov, Velasco-Forero, Wang, Yamada, Zoppino and Marmolejo-Ramos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
- Language
- eng
- Full Text
- Reviewed
File | Description | Size | Format
---|---|---|---
ATTACHMENT02 | Publisher version (open access) | 184 KB | Adobe Acrobat PDF